Overview

Dataset statistics

Number of variables: 15
Number of observations: 5121
Missing cells: 12098
Missing cells (%): 15.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 600.2 KiB
Average record size in memory: 120.0 B
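
The derived figures in this block can be checked from the raw counts. The sketch below is an illustration only: it assumes that "Missing cells (%)" is missing cells divided by variables times observations, and that 1 KiB is 1024 bytes. The variable names are made up for the example.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 15
n_observations = 5121
missing_cells = 12098
total_size_kib = 600.2

# Assumption: the percentage is taken over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the computed values round to the 15.7% and 120.0 B reported above.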

Variable types

Text: 10
Categorical: 5

Alerts

country_code is highly overall correlated with country_name and 2 other fields (High correlation)
country_name is highly overall correlated with country_code and 2 other fields (High correlation)
iso_3166_1_alpha_2 is highly overall correlated with country_code and 2 other fields (High correlation)
iso_3166_1_alpha_3 is highly overall correlated with country_code and 2 other fields (High correlation)
aggregation_level is highly imbalanced (97.6%) (Imbalance)
datacommons_id has 1792 (35.0%) missing values (Missing)
locality_code has 5109 (99.8%) missing values (Missing)
locality_name has 5109 (99.8%) missing values (Missing)
location_key has unique values (Unique)
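
The high-correlation alerts are expected when several columns encode the same country. As a rough illustration, the sketch below computes a plain (uncorrected) Cramér's V between country_code and country_name, rebuilt from the value counts listed later in this report. It is an assumption that the profiler's "overall" correlation uses a measure of this kind for categorical pairs, and the pairing of codes with names is also assumed. The function `cramers_v` is written for this example and is not part of any library.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V between two equal-length label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Value counts as reported for country_code.
counts = {"US": 3228, "ES": 1378, "DE": 412, "IT": 103}
# Assumed one-to-one mapping between codes and names.
names = {
    "US": "United States of America",
    "ES": "Spain",
    "DE": "Germany",
    "IT": "Italy",
}

codes = [code for code, k in counts.items() for _ in range(k)]
country_names = [names[code] for code in codes]

v = cramers_v(codes, country_names)
print(round(v, 6))
```

Because each code maps to exactly one name, the association is perfect and V comes out at 1 up to floating-point error.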

Reproduction

Analysis started: 2023-09-08 00:00:54.886678
Analysis finished: 2023-09-08 00:00:56.112108
Duration: 1.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

location_key
Text

UNIQUE 

Distinct: 5121
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 40.1 KiB

Length

Max length: 13
Median length: 11
Mean length: 10.943175
Min length: 8

Characters and Unicode

Total characters: 56040
Distinct characters: 37
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
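
As a minimal sketch of this kind of analysis, Python's standard `unicodedata` module can tally the general category of every character in a handful of values. The five strings below are the sample rows shown for location_key. The two-letter codes returned by `unicodedata.category` correspond to the category names used in this report (Lu = Uppercase Letter, Nd = Decimal Number, Pc = Connector Punctuation). This is an illustration and not necessarily how the profiler computes its counts.

```python
import unicodedata
from collections import Counter

# Sample rows reported for location_key.
samples = ["DE_BB_12051", "DE_BB_12052", "DE_BB_12053", "DE_BB_12054", "DE_BB_12060"]

# Count the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```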

Unique

Unique: 5121
Unique (%): 100.0%

Sample

1st row: DE_BB_12051
2nd row: DE_BB_12052
3rd row: DE_BB_12053
4th row: DE_BB_12054
5th row: DE_BB_12060
Value                 Count   Frequency (%)
de_bb_12051           1       < 0.1%
de_bb_12052           1       < 0.1%
de_bb_12053           1       < 0.1%
de_bb_12054           1       < 0.1%
de_bb_12060           1       < 0.1%
de_bb_12061           1       < 0.1%
de_bb_12062           1       < 0.1%
de_bb_12063           1       < 0.1%
de_bb_12064           1       < 0.1%
de_be_11002           1       < 0.1%
Other values (5111)   5111    99.8%

Most occurring characters

Value               Count   Frequency (%)
_                   10242   18.3%
S                   4969    8.9%
0                   4832    8.6%
1                   4529    8.1%
U                   3263    5.8%
2                   2821    5.0%
3                   2605    4.6%
5                   2302    4.1%
7                   2052    3.7%
E                   1953    3.5%
Other values (27)   16472   29.4%

Most occurring categories

Value                   Count   Frequency (%)
Decimal Number          25278   45.1%
Uppercase Letter        20520   36.6%
Connector Punctuation   10242   18.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 4969
24.2%
U 3263
15.9%
E 1953
 
9.5%
T 1695
 
8.3%
C 1477
 
7.2%
N 882
 
4.3%
A 840
 
4.1%
D 807
 
3.9%
M 733
 
3.6%
I 663
 
3.2%
Other values (16) 3238
15.8%
Decimal Number
ValueCountFrequency (%)
0 4832
19.1%
1 4529
17.9%
2 2821
11.2%
3 2605
10.3%
5 2302
9.1%
7 2052
8.1%
4 1767
 
7.0%
8 1715
 
6.8%
9 1582
 
6.3%
6 1073
 
4.2%
Connector Punctuation
ValueCountFrequency (%)
_ 10242
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 35520
63.4%
Latin 20520
36.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 4969
24.2%
U 3263
15.9%
E 1953
 
9.5%
T 1695
 
8.3%
C 1477
 
7.2%
N 882
 
4.3%
A 840
 
4.1%
D 807
 
3.9%
M 733
 
3.6%
I 663
 
3.2%
Other values (16) 3238
15.8%
Common
ValueCountFrequency (%)
_ 10242
28.8%
0 4832
13.6%
1 4529
12.8%
2 2821
 
7.9%
3 2605
 
7.3%
5 2302
 
6.5%
7 2052
 
5.8%
4 1767
 
5.0%
8 1715
 
4.8%
9 1582
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56040
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_ 10242
18.3%
S 4969
 
8.9%
0 4832
 
8.6%
1 4529
 
8.1%
U 3263
 
5.8%
2 2821
 
5.0%
3 2605
 
4.6%
5 2302
 
4.1%
7 2052
 
3.7%
E 1953
 
3.5%
Other values (27) 16472
29.4%

Distinct: 5079
Distinct (%): > 99.9%
Missing: 41
Missing (%): 0.8%
Memory size: 40.1 KiB

Length

Max length27
Median length27
Mean length27
Min length27

Characters and Unicode

Total characters137160
Distinct characters64
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5078
Unique (%): > 99.9%
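
The gap between Distinct (5079) and Unique (5078) for this variable is consistent with one value occurring twice among the 5121 - 41 = 5080 non-missing rows. The sketch below illustrates that reading on a made-up column. It assumes "distinct" counts different non-missing values and "unique" counts values that occur exactly once; the toy data is hypothetical.

```python
from collections import Counter

# Hypothetical column: "b" occurs twice, None marks a missing value.
values = ["a", "b", "b", "c", None]

non_null = [v for v in values if v is not None]
freq = Counter(non_null)

missing = len(values) - len(non_null)
distinct = len(freq)                                   # different non-missing values
unique = sum(1 for count in freq.values() if count == 1)  # values seen exactly once

print(missing, distinct, unique)

# Same reasoning applied to the figures reported for this variable.
n_obs, n_missing, n_distinct = 5121, 41, 5079
non_missing = n_obs - n_missing          # rows with a value
repeats = non_missing - n_distinct       # rows that repeat an earlier value
print(non_missing, repeats)
```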

Sample

1st rowChIJN8I30-XAqEcRhUxEEOyL_kg
2nd rowChIJX0qVWUJ0CEcROq1_4LUv1FA
3rd rowChIJb_u1AiqYB0cRwDteW0YgIQQ
4th rowChIJt9Y6hM31qEcRm-yqC5j4ZcU
5th rowChIJuRSkBF66qUcRCDglm8hflWE
ValueCountFrequency (%)
chij5tcocraypbircmzhtz37seq 2
 
< 0.1%
chiji_gahcomqucrbukua-romqo 1
 
< 0.1%
chijt9y6hm31qecrm-yqc5j4zcu 1
 
< 0.1%
chijurskbf66qucrcdglm8hflwe 1
 
< 0.1%
chijtzgngtz3b0cryli-iutzpcs 1
 
< 0.1%
chijmxhwdojyp0crvhczxc3nkw4 1
 
< 0.1%
chij74pplnxdqecr1pr8inhoolm 1
 
< 0.1%
chijzwlb3jjwqucrxb06n0k3wgk 1
 
< 0.1%
chijss8a9zdoqecr5htxfi0sg-a 1
 
< 0.1%
chijtuu0ylenb0cr3_w_qw_ineq 1
 
< 0.1%
Other values (5069) 5069
99.8%

Most occurring characters

ValueCountFrequency (%)
I 8734
 
6.4%
R 6860
 
5.0%
h 6779
 
4.9%
C 6482
 
4.7%
J 6427
 
4.7%
c 3085
 
2.2%
Y 2937
 
2.1%
g 2900
 
2.1%
4 2588
 
1.9%
Q 2588
 
1.9%
Other values (54) 87780
64.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 65929
48.1%
Lowercase Letter 50561
36.9%
Decimal Number 17661
 
12.9%
Connector Punctuation 1512
 
1.1%
Dash Punctuation 1497
 
1.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 8734
 
13.2%
R 6860
 
10.4%
C 6482
 
9.8%
J 6427
 
9.7%
Y 2937
 
4.5%
Q 2588
 
3.9%
A 2319
 
3.5%
U 2100
 
3.2%
M 1986
 
3.0%
B 1768
 
2.7%
Other values (16) 23728
36.0%
Lowercase Letter
ValueCountFrequency (%)
h 6779
 
13.4%
c 3085
 
6.1%
g 2900
 
5.7%
o 2567
 
5.1%
w 2253
 
4.5%
k 2220
 
4.4%
p 1920
 
3.8%
s 1816
 
3.6%
u 1680
 
3.3%
x 1679
 
3.3%
Other values (16) 23662
46.8%
Decimal Number
ValueCountFrequency (%)
4 2588
14.7%
0 2223
12.6%
8 1869
10.6%
6 1662
9.4%
1 1579
8.9%
7 1568
8.9%
3 1551
8.8%
9 1550
8.8%
2 1548
8.8%
5 1523
8.6%
Connector Punctuation
ValueCountFrequency (%)
_ 1512
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1497
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 116490
84.9%
Common 20670
 
15.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 8734
 
7.5%
R 6860
 
5.9%
h 6779
 
5.8%
C 6482
 
5.6%
J 6427
 
5.5%
c 3085
 
2.6%
Y 2937
 
2.5%
g 2900
 
2.5%
Q 2588
 
2.2%
o 2567
 
2.2%
Other values (42) 67131
57.6%
Common
ValueCountFrequency (%)
4 2588
12.5%
0 2223
10.8%
8 1869
9.0%
6 1662
8.0%
1 1579
7.6%
7 1568
7.6%
3 1551
7.5%
9 1550
7.5%
2 1548
7.5%
5 1523
7.4%
Other values (2) 3009
14.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 137160
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 8734
 
6.4%
R 6860
 
5.0%
h 6779
 
4.9%
C 6482
 
4.7%
J 6427
 
4.7%
c 3085
 
2.2%
Y 2937
 
2.1%
g 2900
 
2.1%
4 2588
 
1.9%
Q 2588
 
1.9%
Other values (54) 87780
64.0%

Distinct: 5097
Distinct (%): > 99.9%
Missing: 23
Missing (%): 0.4%
Memory size: 40.1 KiB

Length

Max length9
Median length7
Mean length6.766379
Min length3

Characters and Unicode

Total characters34495
Distinct characters11
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5096 ?
Unique (%)> 99.9%

Sample

1st rowQ3931
2nd rowQ3214
3rd rowQ4024
4th rowQ1711
5th rowQ6115
ValueCountFrequency (%)
q1492 2
 
< 0.1%
q6178 1
 
< 0.1%
q6115 1
 
< 0.1%
q6173 1
 
< 0.1%
q6152 1
 
< 0.1%
q6139 1
 
< 0.1%
q6181 1
 
< 0.1%
q158893 1
 
< 0.1%
q6119 1
 
< 0.1%
q6125 1
 
< 0.1%
Other values (5087) 5087
99.8%

Most occurring characters

ValueCountFrequency (%)
Q 5098
14.8%
1 4325
12.5%
4 3637
10.5%
5 3184
9.2%
9 2863
8.3%
8 2775
8.0%
6 2745
8.0%
0 2691
7.8%
2 2495
7.2%
3 2371
6.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 29397
85.2%
Uppercase Letter 5098
 
14.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 4325
14.7%
4 3637
12.4%
5 3184
10.8%
9 2863
9.7%
8 2775
9.4%
6 2745
9.3%
0 2691
9.2%
2 2495
8.5%
3 2371
8.1%
7 2311
7.9%
Uppercase Letter
ValueCountFrequency (%)
Q 5098
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 29397
85.2%
Latin 5098
 
14.8%

Most frequent character per script

Common
ValueCountFrequency (%)
1 4325
14.7%
4 3637
12.4%
5 3184
10.8%
9 2863
9.7%
8 2775
9.4%
6 2745
9.3%
0 2691
9.2%
2 2495
8.5%
3 2371
8.1%
7 2311
7.9%
Latin
ValueCountFrequency (%)
Q 5098
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34495
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Q 5098
14.8%
1 4325
12.5%
4 3637
10.5%
5 3184
9.2%
9 2863
8.3%
8 2775
8.0%
6 2745
8.0%
0 2691
7.8%
2 2495
7.2%
3 2371
6.9%

datacommons_id
Text

MISSING 

Distinct: 3329
Distinct (%): 100.0%
Missing: 1792
Missing (%): 35.0%
Memory size: 40.1 KiB

Length

Max length13
Median length11
Mean length10.93872
Min length9

Characters and Unicode

Total characters36415
Distinct characters40
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3329 ?
Unique (%)100.0%

Sample

1st rowiso/IT-AL
2nd rowiso/IT-AT
3rd rowiso/IT-BI
4th rowiso/IT-CN
5th rowiso/IT-NO
ValueCountFrequency (%)
geoid/44003 1
 
< 0.1%
geoid/28083 1
 
< 0.1%
iso/it-cr 1
 
< 0.1%
iso/it-at 1
 
< 0.1%
iso/it-bi 1
 
< 0.1%
iso/it-cn 1
 
< 0.1%
iso/it-no 1
 
< 0.1%
iso/it-to 1
 
< 0.1%
iso/it-vb 1
 
< 0.1%
iso/it-vc 1
 
< 0.1%
Other values (3319) 3319
99.7%

Most occurring characters

ValueCountFrequency (%)
I 3336
9.2%
o 3329
9.1%
/ 3329
9.1%
g 3226
8.9%
d 3226
8.9%
e 3226
8.9%
1 3049
8.4%
0 3044
8.4%
3 1802
 
4.9%
2 1572
 
4.3%
Other values (30) 7276
20.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16132
44.3%
Lowercase Letter 13216
36.3%
Uppercase Letter 3635
 
10.0%
Other Punctuation 3329
 
9.1%
Dash Punctuation 103
 
0.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 3336
91.8%
T 117
 
3.2%
R 19
 
0.5%
C 17
 
0.5%
P 16
 
0.4%
A 15
 
0.4%
S 14
 
0.4%
V 13
 
0.4%
B 12
 
0.3%
O 11
 
0.3%
Other values (12) 65
 
1.8%
Decimal Number
ValueCountFrequency (%)
1 3049
18.9%
0 3044
18.9%
3 1802
11.2%
2 1572
9.7%
5 1526
9.5%
7 1361
8.4%
9 1175
 
7.3%
4 1162
 
7.2%
8 808
 
5.0%
6 633
 
3.9%
Lowercase Letter
ValueCountFrequency (%)
o 3329
25.2%
g 3226
24.4%
d 3226
24.4%
e 3226
24.4%
i 106
 
0.8%
s 103
 
0.8%
Other Punctuation
ValueCountFrequency (%)
/ 3329
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 103
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 19564
53.7%
Latin 16851
46.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 3336
19.8%
o 3329
19.8%
g 3226
19.1%
d 3226
19.1%
e 3226
19.1%
T 117
 
0.7%
i 106
 
0.6%
s 103
 
0.6%
R 19
 
0.1%
C 17
 
0.1%
Other values (18) 146
 
0.9%
Common
ValueCountFrequency (%)
/ 3329
17.0%
1 3049
15.6%
0 3044
15.6%
3 1802
9.2%
2 1572
8.0%
5 1526
7.8%
7 1361
7.0%
9 1175
 
6.0%
4 1162
 
5.9%
8 808
 
4.1%
Other values (2) 736
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36415
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 3336
9.2%
o 3329
9.1%
/ 3329
9.1%
g 3226
8.9%
d 3226
8.9%
e 3226
8.9%
1 3049
8.4%
0 3044
8.4%
3 1802
 
4.9%
2 1572
 
4.3%
Other values (30) 7276
20.0%

country_code
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.1 KiB
US: 3228
ES: 1378
DE: 412
IT: 103

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters10242
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowDE
2nd rowDE
3rd rowDE
4th rowDE
5th rowDE

Common Values

Value   Count   Frequency (%)
US      3228    63.0%
ES      1378    26.9%
DE      412     8.0%
IT      103     2.0%
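
The frequency column can be reproduced from the counts alone. A short sketch, assuming each percentage is the count divided by the number of non-missing rows and rounded to one decimal place:

```python
# Counts as reported for country_code.
counts = {"US": 3228, "ES": 1378, "DE": 412, "IT": 103}

total = sum(counts.values())
# Share of each value, in percent, rounded to one decimal.
shares = {code: round(100 * n / total, 1) for code, n in counts.items()}

print(total)
for code, share in shares.items():
    print(f"{code}: {share}%")
```

The counts sum to 5121, the number of observations, which agrees with the variable having no missing values.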

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
us 3228
63.0%
es 1378
26.9%
de 412
 
8.0%
it 103
 
2.0%

Most occurring characters

ValueCountFrequency (%)
S 4606
45.0%
U 3228
31.5%
E 1790
 
17.5%
D 412
 
4.0%
I 103
 
1.0%
T 103
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 10242
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 4606
45.0%
U 3228
31.5%
E 1790
 
17.5%
D 412
 
4.0%
I 103
 
1.0%
T 103
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10242
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 4606
45.0%
U 3228
31.5%
E 1790
 
17.5%
D 412
 
4.0%
I 103
 
1.0%
T 103
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10242
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 4606
45.0%
U 3228
31.5%
E 1790
 
17.5%
D 412
 
4.0%
I 103
 
1.0%
T 103
 
1.0%

country_name
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.1 KiB
United States of America: 3228
Spain: 1378
Germany: 412
Italy: 103

Length

Max length24
Median length24
Mean length17.137473
Min length5

Characters and Unicode

Total characters87761
Distinct characters21
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowGermany
2nd rowGermany
3rd rowGermany
4th rowGermany
5th rowGermany

Common Values

ValueCountFrequency (%)
United States of America 3228
63.0%
Spain 1378
26.9%
Germany 412
 
8.0%
Italy 103
 
2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
united 3228
21.8%
states 3228
21.8%
of 3228
21.8%
america 3228
21.8%
spain 1378
9.3%
germany 412
 
2.8%
italy 103
 
0.7%

Most occurring characters

ValueCountFrequency (%)
e 10096
11.5%
t 9787
11.2%
9684
11.0%
a 8349
 
9.5%
i 7834
 
8.9%
n 5018
 
5.7%
S 4606
 
5.2%
r 3640
 
4.1%
m 3640
 
4.1%
U 3228
 
3.7%
Other values (11) 21879
24.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 66500
75.8%
Uppercase Letter 11577
 
13.2%
Space Separator 9684
 
11.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 10096
15.2%
t 9787
14.7%
a 8349
12.6%
i 7834
11.8%
n 5018
7.5%
r 3640
 
5.5%
m 3640
 
5.5%
c 3228
 
4.9%
o 3228
 
4.9%
f 3228
 
4.9%
Other values (5) 8452
12.7%
Uppercase Letter
ValueCountFrequency (%)
S 4606
39.8%
U 3228
27.9%
A 3228
27.9%
G 412
 
3.6%
I 103
 
0.9%
Space Separator
ValueCountFrequency (%)
9684
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 78077
89.0%
Common 9684
 
11.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 10096
12.9%
t 9787
12.5%
a 8349
10.7%
i 7834
10.0%
n 5018
 
6.4%
S 4606
 
5.9%
r 3640
 
4.7%
m 3640
 
4.7%
U 3228
 
4.1%
A 3228
 
4.1%
Other values (10) 18651
23.9%
Common
ValueCountFrequency (%)
9684
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 87761
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 10096
11.5%
t 9787
11.2%
9684
11.0%
a 8349
 
9.5%
i 7834
 
8.9%
n 5018
 
5.7%
S 4606
 
5.2%
r 3640
 
4.1%
m 3640
 
4.1%
U 3228
 
3.7%
Other values (11) 21879
24.9%

Distinct: 89
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 40.1 KiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters10242
Distinct characters33
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowBB
2nd rowBB
3rd rowBB
4th rowBB
5th rowBB
ValueCountFrequency (%)
ct 1091
21.3%
tx 254
 
5.0%
md 224
 
4.4%
ga 160
 
3.1%
va 133
 
2.6%
ky 120
 
2.3%
mo 115
 
2.2%
ks 105
 
2.1%
il 102
 
2.0%
nc 100
 
2.0%
Other values (79) 2717
53.1%

Most occurring characters

ValueCountFrequency (%)
T 1576
15.4%
C 1456
14.2%
N 870
 
8.5%
A 821
 
8.0%
M 720
 
7.0%
I 550
 
5.4%
D 391
 
3.8%
O 380
 
3.7%
S 347
 
3.4%
K 331
 
3.2%
Other values (23) 2800
27.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 10036
98.0%
Decimal Number 206
 
2.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T 1576
15.7%
C 1456
14.5%
N 870
 
8.7%
A 821
 
8.2%
M 720
 
7.2%
I 550
 
5.5%
D 391
 
3.9%
O 380
 
3.8%
S 347
 
3.5%
K 331
 
3.3%
Other values (15) 2594
25.8%
Decimal Number
ValueCountFrequency (%)
2 52
25.2%
5 50
24.3%
7 26
12.6%
8 24
11.7%
4 20
 
9.7%
6 15
 
7.3%
3 11
 
5.3%
1 8
 
3.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 10036
98.0%
Common 206
 
2.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 1576
15.7%
C 1456
14.5%
N 870
 
8.7%
A 821
 
8.2%
M 720
 
7.2%
I 550
 
5.5%
D 391
 
3.9%
O 380
 
3.8%
S 347
 
3.5%
K 331
 
3.3%
Other values (15) 2594
25.8%
Common
ValueCountFrequency (%)
2 52
25.2%
5 50
24.3%
7 26
12.6%
8 24
11.7%
4 20
 
9.7%
6 15
 
7.3%
3 11
 
5.3%
1 8
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10242
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 1576
15.4%
C 1456
14.2%
N 870
 
8.5%
A 821
 
8.0%
M 720
 
7.0%
I 550
 
5.4%
D 391
 
3.8%
O 380
 
3.7%
S 347
 
3.4%
K 331
 
3.2%
Other values (23) 2800
27.3%

Distinct: 91
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 40.1 KiB

Length

Max length24
Median length22
Mean length9.0244093
Min length4

Characters and Unicode

Total characters46214
Distinct characters51
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowBrandenburg
2nd rowBrandenburg
3rd rowBrandenburg
4th rowBrandenburg
5th rowBrandenburg
ValueCountFrequency (%)
cataluña 1083
 
17.3%
texas 254
 
4.1%
north 206
 
3.3%
comunidad 200
 
3.2%
de 200
 
3.2%
madrid 200
 
3.2%
virginia 188
 
3.0%
georgia 160
 
2.6%
carolina 146
 
2.3%
new 127
 
2.0%
Other values (93) 3497
55.9%

Most occurring characters

ValueCountFrequency (%)
a 8342
18.1%
i 3799
 
8.2%
n 2929
 
6.3%
o 2656
 
5.7%
s 2579
 
5.6%
e 2356
 
5.1%
t 2305
 
5.0%
l 2232
 
4.8%
r 2116
 
4.6%
u 1877
 
4.1%
Other values (41) 15023
32.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 38656
83.6%
Uppercase Letter 6239
 
13.5%
Space Separator 1140
 
2.5%
Dash Punctuation 179
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8342
21.6%
i 3799
9.8%
n 2929
 
7.6%
o 2656
 
6.9%
s 2579
 
6.7%
e 2356
 
6.1%
t 2305
 
6.0%
l 2232
 
5.8%
r 2116
 
5.5%
u 1877
 
4.9%
Other values (16) 7465
19.3%
Uppercase Letter
ValueCountFrequency (%)
C 1665
26.7%
M 727
11.7%
N 445
 
7.1%
I 442
 
7.1%
T 382
 
6.1%
W 286
 
4.6%
K 225
 
3.6%
V 224
 
3.6%
S 219
 
3.5%
A 210
 
3.4%
Other values (13) 1414
22.7%
Space Separator
ValueCountFrequency (%)
1140
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 179
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 44895
97.1%
Common 1319
 
2.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8342
18.6%
i 3799
 
8.5%
n 2929
 
6.5%
o 2656
 
5.9%
s 2579
 
5.7%
e 2356
 
5.2%
t 2305
 
5.1%
l 2232
 
5.0%
r 2116
 
4.7%
u 1877
 
4.2%
Other values (39) 13704
30.5%
Common
ValueCountFrequency (%)
1140
86.4%
- 179
 
13.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45087
97.6%
None 1127
 
2.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8342
18.5%
i 3799
 
8.4%
n 2929
 
6.5%
o 2656
 
5.9%
s 2579
 
5.7%
e 2356
 
5.2%
t 2305
 
5.1%
l 2232
 
5.0%
r 2116
 
4.7%
u 1877
 
4.2%
Other values (39) 13896
30.8%
None
ValueCountFrequency (%)
ñ 1083
96.1%
ü 44
 
3.9%
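
The two characters listed under the block labelled None (ñ and ü) are the only non-ASCII characters counted for this variable: 1083 + 44 = 1127. The sketch below shows one way to pick such characters out of a value and look up their Unicode names with the standard library. The helper `non_ascii_chars` is hypothetical and written for this illustration; it does not reproduce the profiler's block lookup.

```python
import unicodedata


def non_ascii_chars(text):
    """Return (character, Unicode name) pairs for characters outside ASCII."""
    return [(ch, unicodedata.name(ch)) for ch in text if not ch.isascii()]


# "cataluña", as listed in the value table for this variable.
print(non_ascii_chars("catalu\u00f1a"))
print(non_ascii_chars("texas"))
```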

Distinct: 4680
Distinct (%): 91.6%
Missing: 12
Missing (%): 0.2%
Memory size: 40.1 KiB

Length

Max length7
Median length5
Mean length4.9477393
Min length2

Characters and Unicode

Total characters25278
Distinct characters32
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4266 ?
Unique (%)83.5%

Sample

1st row12051
2nd row12052
3rd row12053
4th row12054
5th row12060
ValueCountFrequency (%)
12051 3
 
0.1%
12071 3
 
0.1%
12067 3
 
0.1%
12061 3
 
0.1%
12053 3
 
0.1%
12063 3
 
0.1%
08125 3
 
0.1%
12069 3
 
0.1%
12065 3
 
0.1%
12073 3
 
0.1%
Other values (4670) 5079
99.4%

Most occurring characters

ValueCountFrequency (%)
0 4832
19.1%
1 4521
17.9%
2 2769
11.0%
3 2594
10.3%
5 2252
8.9%
7 2026
8.0%
4 1747
 
6.9%
8 1691
 
6.7%
9 1582
 
6.3%
6 1058
 
4.2%
Other values (22) 206
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 25072
99.2%
Uppercase Letter 206
 
0.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 19
 
9.2%
C 17
 
8.3%
P 16
 
7.8%
A 15
 
7.3%
T 14
 
6.8%
S 14
 
6.8%
V 13
 
6.3%
B 12
 
5.8%
O 11
 
5.3%
M 11
 
5.3%
Other values (12) 64
31.1%
Decimal Number
ValueCountFrequency (%)
0 4832
19.3%
1 4521
18.0%
2 2769
11.0%
3 2594
10.3%
5 2252
9.0%
7 2026
8.1%
4 1747
 
7.0%
8 1691
 
6.7%
9 1582
 
6.3%
6 1058
 
4.2%

Most occurring scripts

ValueCountFrequency (%)
Common 25072
99.2%
Latin 206
 
0.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 19
 
9.2%
C 17
 
8.3%
P 16
 
7.8%
A 15
 
7.3%
T 14
 
6.8%
S 14
 
6.8%
V 13
 
6.3%
B 12
 
5.8%
O 11
 
5.3%
M 11
 
5.3%
Other values (12) 64
31.1%
Common
ValueCountFrequency (%)
0 4832
19.3%
1 4521
18.0%
2 2769
11.0%
3 2594
10.3%
5 2252
9.0%
7 2026
8.1%
4 1747
 
7.0%
8 1691
 
6.7%
9 1582
 
6.3%
6 1058
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25278
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4832
19.1%
1 4521
17.9%
2 2769
11.0%
3 2594
10.3%
5 2252
8.9%
7 2026
8.0%
4 1747
 
6.9%
8 1691
 
6.7%
9 1582
 
6.3%
6 1058
 
4.2%
Other values (22) 206
 
0.8%

Distinct: 3834
Distinct (%): 75.0%
Missing: 12
Missing (%): 0.2%
Memory size: 40.1 KiB

Length

Max length44
Median length33
Mean length13.225289
Min length3

Characters and Unicode

Total characters67568
Distinct characters81
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3401 ?
Unique (%)66.6%

Sample

1st rowBrandenburg an der Havel
2nd rowCottbus
3rd rowFrankfurt an der Oder
4th rowPotsdam
5th rowBarnim
ValueCountFrequency (%)
county 3008
30.1%
de 334
 
3.3%
la 140
 
1.4%
sant 92
 
0.9%
municipio 78
 
0.8%
del 77
 
0.8%
parish 64
 
0.6%
el 50
 
0.5%
santa 42
 
0.4%
san 40
 
0.4%
Other values (3934) 6078
60.8%

Most occurring characters

ValueCountFrequency (%)
n 6219
 
9.2%
o 5846
 
8.7%
a 5025
 
7.4%
4894
 
7.2%
t 4838
 
7.2%
e 4496
 
6.7%
u 4228
 
6.3%
C 3691
 
5.5%
y 3444
 
5.1%
r 3194
 
4.7%
Other values (71) 21693
32.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 52774
78.1%
Uppercase Letter 9630
 
14.3%
Space Separator 4894
 
7.2%
Dash Punctuation 141
 
0.2%
Other Punctuation 117
 
0.2%
Close Punctuation 5
 
< 0.1%
Open Punctuation 5
 
< 0.1%
Final Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 6219
11.8%
o 5846
11.1%
a 5025
9.5%
t 4838
9.2%
e 4496
8.5%
u 4228
8.0%
y 3444
 
6.5%
r 3194
 
6.1%
l 3065
 
5.8%
i 2541
 
4.8%
Other values (32) 9878
18.7%
Uppercase Letter
ValueCountFrequency (%)
C 3691
38.3%
S 639
 
6.6%
M 636
 
6.6%
L 494
 
5.1%
P 471
 
4.9%
B 453
 
4.7%
A 345
 
3.6%
G 272
 
2.8%
R 265
 
2.8%
H 264
 
2.7%
Other values (20) 2100
21.8%
Other Punctuation
ValueCountFrequency (%)
' 78
66.7%
. 28
 
23.9%
, 6
 
5.1%
/ 5
 
4.3%
Space Separator
ValueCountFrequency (%)
4894
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 141
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%
Final Punctuation
ValueCountFrequency (%)
’ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 62404
92.4%
Common 5164
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 6219
 
10.0%
o 5846
 
9.4%
a 5025
 
8.1%
t 4838
 
7.8%
e 4496
 
7.2%
u 4228
 
6.8%
C 3691
 
5.9%
y 3444
 
5.5%
r 3194
 
5.1%
l 3065
 
4.9%
Other values (62) 18358
29.4%
Common
ValueCountFrequency (%)
4894
94.8%
- 141
 
2.7%
' 78
 
1.5%
. 28
 
0.5%
, 6
 
0.1%
) 5
 
0.1%
( 5
 
0.1%
/ 5
 
0.1%
’ 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 67068
99.3%
None 498
 
0.7%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 6219
 
9.3%
o 5846
 
8.7%
a 5025
 
7.5%
4894
 
7.3%
t 4838
 
7.2%
e 4496
 
6.7%
u 4228
 
6.3%
C 3691
 
5.5%
y 3444
 
5.1%
r 3194
 
4.8%
Other values (50) 21193
31.6%
None
ValueCountFrequency (%)
à 109
21.9%
ó 74
14.9%
í 73
14.7%
è 42
 
8.4%
ü 39
 
7.8%
ç 27
 
5.4%
é 25
 
5.0%
ö 18
 
3.6%
ñ 18
 
3.6%
á 17
 
3.4%
Other values (10) 56
11.2%
Punctuation
ValueCountFrequency (%)
’ 2
100.0%

locality_code
Text

MISSING 

Distinct: 12
Distinct (%): 100.0%
Missing: 5109
Missing (%): 99.8%
Memory size: 40.1 KiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters36
Distinct characters18
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique12 ?
Unique (%)100.0%

Sample

1st rowACE
2nd rowFUE
3rd rowGMZ
4th rowLPA
5th rowSPC
ValueCountFrequency (%)
ace 1
8.3%
fue 1
8.3%
gmz 1
8.3%
lpa 1
8.3%
spc 1
8.3%
tfn 1
8.3%
vde 1
8.3%
bcn 1
8.3%
mad 1
8.3%
sfo 1
8.3%
Other values (2) 2
16.7%

Most occurring characters

ValueCountFrequency (%)
A 4
11.1%
C 4
11.1%
E 3
 
8.3%
F 3
 
8.3%
N 3
 
8.3%
D 2
 
5.6%
T 2
 
5.6%
S 2
 
5.6%
P 2
 
5.6%
L 2
 
5.6%
Other values (8) 9
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 36
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 4
11.1%
C 4
11.1%
E 3
 
8.3%
F 3
 
8.3%
N 3
 
8.3%
D 2
 
5.6%
T 2
 
5.6%
S 2
 
5.6%
P 2
 
5.6%
L 2
 
5.6%
Other values (8) 9
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 36
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 4
11.1%
C 4
11.1%
E 3
 
8.3%
F 3
 
8.3%
N 3
 
8.3%
D 2
 
5.6%
T 2
 
5.6%
S 2
 
5.6%
P 2
 
5.6%
L 2
 
5.6%
Other values (8) 9
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 4
11.1%
C 4
11.1%
E 3
 
8.3%
F 3
 
8.3%
N 3
 
8.3%
D 2
 
5.6%
T 2
 
5.6%
S 2
 
5.6%
P 2
 
5.6%
L 2
 
5.6%
Other values (8) 9
25.0%

locality_name
Text

MISSING 

Distinct: 12
Distinct (%): 100.0%
Missing: 5109
Missing (%): 99.8%
Memory size: 40.1 KiB

Length

Max length13
Median length12
Mean length9.6666667
Min length6

Characters and Unicode

Total characters116
Distinct characters34
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique12 ?
Unique (%)100.0%

Sample

1st rowLanzarote
2nd rowFuerteventura
3rd rowLa Gomera
4th rowGran Canaria
5th rowLa Palma
ValueCountFrequency (%)
la 2
 
10.5%
lanzarote 1
 
5.3%
barcelona 1
 
5.3%
york 1
 
5.3%
new 1
 
5.3%
atlanta 1
 
5.3%
francisco 1
 
5.3%
san 1
 
5.3%
madrid 1
 
5.3%
hierro 1
 
5.3%
Other values (8) 8
42.1%

Most occurring characters

ValueCountFrequency (%)
a 19
16.4%
r 13
11.2%
e 11
 
9.5%
n 9
 
7.8%
7
 
6.0%
o 6
 
5.2%
t 6
 
5.2%
i 6
 
5.2%
l 4
 
3.4%
c 3
 
2.6%
Other values (24) 32
27.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 90
77.6%
Uppercase Letter 19
 
16.4%
Space Separator 7
 
6.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 19
21.1%
r 13
14.4%
e 11
12.2%
n 9
10.0%
o 6
 
6.7%
t 6
 
6.7%
i 6
 
6.7%
l 4
 
4.4%
c 3
 
3.3%
d 2
 
2.2%
Other values (9) 11
12.2%
Uppercase Letter
ValueCountFrequency (%)
L 3
15.8%
C 2
10.5%
F 2
10.5%
G 2
10.5%
Y 1
 
5.3%
N 1
 
5.3%
A 1
 
5.3%
S 1
 
5.3%
B 1
 
5.3%
M 1
 
5.3%
Other values (4) 4
21.1%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 109
94.0%
Common 7
 
6.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 19
17.4%
r 13
11.9%
e 11
 
10.1%
n 9
 
8.3%
o 6
 
5.5%
t 6
 
5.5%
i 6
 
5.5%
l 4
 
3.7%
c 3
 
2.8%
L 3
 
2.8%
Other values (23) 29
26.6%
Common
ValueCountFrequency (%)
7
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 116
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 19
16.4%
r 13
11.2%
e 11
 
9.5%
n 9
 
7.8%
7
 
6.0%
o 6
 
5.2%
t 6
 
5.2%
i 6
 
5.2%
l 4
 
3.4%
c 3
 
2.6%
Other values (24) 32
27.6%

iso_3166_1_alpha_2
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.1 KiB
US: 3228
ES: 1378
DE: 412
IT: 103

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters10242
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowDE
2nd rowDE
3rd rowDE
4th rowDE
5th rowDE

Common Values

ValueCountFrequency (%)
US 3228
63.0%
ES 1378
26.9%
DE 412
 
8.0%
IT 103
 
2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
us 3228
63.0%
es 1378
26.9%
de 412
 
8.0%
it 103
 
2.0%

Most occurring characters

ValueCountFrequency (%)
S 4606
45.0%
U 3228
31.5%
E 1790
 
17.5%
D 412
 
4.0%
I 103
 
1.0%
T 103
 
1.0%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 10242 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
S | 4606 | 45.0%
U | 3228 | 31.5%
E | 1790 | 17.5%
D | 412 | 4.0%
I | 103 | 1.0%
T | 103 | 1.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 10242 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
S | 4606 | 45.0%
U | 3228 | 31.5%
E | 1790 | 17.5%
D | 412 | 4.0%
I | 103 | 1.0%
T | 103 | 1.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10242 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
S | 4606 | 45.0%
U | 3228 | 31.5%
E | 1790 | 17.5%
D | 412 | 4.0%
I | 103 | 1.0%
T | 103 | 1.0%

iso_3166_1_alpha_3
Categorical

HIGH CORRELATION 

Distinct 4
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory size 40.1 KiB
USA 3228
ESP 1378
DEU 412
ITA 103

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 15363
Distinct characters 8
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row DEU
2nd row DEU
3rd row DEU
4th row DEU
5th row DEU

Common Values

Value | Count | Frequency (%)
USA | 3228 | 63.0%
ESP | 1378 | 26.9%
DEU | 412 | 8.0%
ITA | 103 | 2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
usa | 3228 | 63.0%
esp | 1378 | 26.9%
deu | 412 | 8.0%
ita | 103 | 2.0%

Most occurring characters

Value | Count | Frequency (%)
S | 4606 | 30.0%
U | 3640 | 23.7%
A | 3331 | 21.7%
E | 1790 | 11.7%
P | 1378 | 9.0%
D | 412 | 2.7%
I | 103 | 0.7%
T | 103 | 0.7%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 15363 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
S | 4606 | 30.0%
U | 3640 | 23.7%
A | 3331 | 21.7%
E | 1790 | 11.7%
P | 1378 | 9.0%
D | 412 | 2.7%
I | 103 | 0.7%
T | 103 | 0.7%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 15363 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
S | 4606 | 30.0%
U | 3640 | 23.7%
A | 3331 | 21.7%
E | 1790 | 11.7%
P | 1378 | 9.0%
D | 412 | 2.7%
I | 103 | 0.7%
T | 103 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 15363 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
S | 4606 | 30.0%
U | 3640 | 23.7%
A | 3331 | 21.7%
E | 1790 | 11.7%
P | 1378 | 9.0%
D | 412 | 2.7%
I | 103 | 0.7%
T | 103 | 0.7%

aggregation_level
Categorical

IMBALANCE 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 40.1 KiB
2 5109
3 12

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 5121
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 2
2nd row 2
3rd row 2
4th row 2
5th row 2

Common Values

Value | Count | Frequency (%)
2 | 5109 | 99.8%
3 | 12 | 0.2%
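
The IMBALANCE flag on this variable can be related to these two counts. A minimal sketch, assuming the imbalance score is one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy (the function name and the single-class convention are illustrative assumptions, not taken from the report); with the counts above it comes out close to the 97.6% quoted in the report's alerts:

```python
import math

# Category counts for aggregation_level as reported in this section.
counts = {"2": 5109, "3": 12}

def imbalance_score(class_counts):
    """Return 1 - (Shannon entropy / maximum entropy) for the given class counts."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    if n_classes < 2:
        return 0.0  # assumed convention: a single class is scored as 0
    entropy = -sum(
        (c / total) * math.log2(c / total)
        for c in class_counts.values()
        if c > 0
    )
    return 1.0 - entropy / math.log2(n_classes)

score = imbalance_score(counts)
print(f"{score:.1%}")
```

A perfectly balanced variable would score 0 under this definition, and the score approaches 1 as a single class dominates.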

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2 | 5109 | 99.8%
3 | 12 | 0.2%

Most occurring characters

Value | Count | Frequency (%)
2 | 5109 | 99.8%
3 | 12 | 0.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 5121 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 5109 | 99.8%
3 | 12 | 0.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 5121 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 5109 | 99.8%
3 | 12 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5121 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 5109 | 99.8%
3 | 12 | 0.2%

Correlations

variable | country_code | country_name | iso_3166_1_alpha_2 | iso_3166_1_alpha_3 | aggregation_level
country_code | 1.000 | 1.000 | 1.000 | 1.000 | 0.047
country_name | 1.000 | 1.000 | 1.000 | 1.000 | 0.047
iso_3166_1_alpha_2 | 1.000 | 1.000 | 1.000 | 1.000 | 0.047
iso_3166_1_alpha_3 | 1.000 | 1.000 | 1.000 | 1.000 | 0.047
aggregation_level | 0.047 | 0.047 | 0.047 | 0.047 | 1.000
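
The 1.000 entries between the country columns are what one would expect when each column is a one-to-one relabelling of the others. A minimal sketch, assuming the coefficients are an association measure for categorical pairs such as Cramér's V (this section does not name the measure) and assuming each alpha-2 code always pairs with the same alpha-3 code, which the matching counts suggest:

```python
import math
from collections import Counter

# Joint counts rebuilt from the per-column counts in this report.
# Assumption: every alpha-2 code maps to exactly one alpha-3 code.
pairs = {
    ("US", "USA"): 3228,
    ("ES", "ESP"): 1378,
    ("DE", "DEU"): 412,
    ("IT", "ITA"): 103,
}

def cramers_v(pair_counts):
    """Uncorrected Cramér's V computed from a dict of (x, y) -> count."""
    n = sum(pair_counts.values())
    row_totals, col_totals = Counter(), Counter()
    for (x, y), c in pair_counts.items():
        row_totals[x] += c
        col_totals[y] += c
    # Chi-squared statistic over the full contingency table, empty cells included.
    chi2 = 0.0
    for x, rt in row_totals.items():
        for y, ct in col_totals.items():
            expected = rt * ct / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

v = cramers_v(pairs)
print(round(v, 3))
```

The sketch omits any bias correction, so values for weakly associated pairs such as the 0.047 entries may differ slightly from what a corrected estimator reports.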

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
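
To make the idea of nullity correlation concrete, the sketch below encodes each column as a present/missing indicator and computes the Pearson correlation between indicators. The rows are hypothetical toy data (the locality values in particular are invented for illustration), and the helper names are assumptions rather than the tool's own API.

```python
import math

# Hypothetical toy rows; None marks a missing value.
rows = [
    {"datacommons_id": None, "locality_code": None, "locality_name": None},
    {"datacommons_id": "geoId/56027", "locality_code": None, "locality_name": None},
    {"datacommons_id": "geoId/56029", "locality_code": "L1", "locality_name": "Town A"},
    {"datacommons_id": None, "locality_code": "L2", "locality_name": "Town B"},
]

def nullity(column):
    """Return 1 where the value is present and 0 where it is missing."""
    return [0 if row[column] is None else 1 for row in rows]

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Columns that are always missing together correlate perfectly.
same = pearson(nullity("locality_code"), nullity("locality_name"))
# Columns whose missingness is unrelated in this toy data do not.
mixed = pearson(nullity("datacommons_id"), nullity("locality_code"))
print(same, mixed)
```

A column that is never missing, or always missing, has zero variance in its indicator, so this coefficient is undefined for it and such columns need to be skipped before calling `pearson`.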

Sample

index | location_key | place_id | wikidata_id | datacommons_id | country_code | country_name | subregion1_code | subregion1_name | subregion2_code | subregion2_name | locality_code | locality_name | iso_3166_1_alpha_2 | iso_3166_1_alpha_3 | aggregation_level
0 | DE_BB_12051 | ChIJN8I30-XAqEcRhUxEEOyL_kg | Q3931 | NaN | DE | Germany | BB | Brandenburg | 12051 | Brandenburg an der Havel | NaN | NaN | DE | DEU | 2
1 | DE_BB_12052 | ChIJX0qVWUJ0CEcROq1_4LUv1FA | Q3214 | NaN | DE | Germany | BB | Brandenburg | 12052 | Cottbus | NaN | NaN | DE | DEU | 2
2 | DE_BB_12053 | ChIJb_u1AiqYB0cRwDteW0YgIQQ | Q4024 | NaN | DE | Germany | BB | Brandenburg | 12053 | Frankfurt an der Oder | NaN | NaN | DE | DEU | 2
3 | DE_BB_12054 | ChIJt9Y6hM31qEcRm-yqC5j4ZcU | Q1711 | NaN | DE | Germany | BB | Brandenburg | 12054 | Potsdam | NaN | NaN | DE | DEU | 2
4 | DE_BB_12060 | ChIJuRSkBF66qUcRCDglm8hflWE | Q6115 | NaN | DE | Germany | BB | Brandenburg | 12060 | Barnim | NaN | NaN | DE | DEU | 2
5 | DE_BB_12061 | ChIJTZGnGtz3B0cRyLI-iUtZPCs | Q6173 | NaN | DE | Germany | BB | Brandenburg | 12061 | Dahme-Spreewald | NaN | NaN | DE | DEU | 2
6 | DE_BB_12062 | ChIJmxhwdojyp0cRVHczxC3NkW4 | Q6152 | NaN | DE | Germany | BB | Brandenburg | 12062 | Elbe-Elster | NaN | NaN | DE | DEU | 2
7 | DE_BB_12063 | ChIJ74pPLNXdqEcR1Pr8iNHoOLM | Q6139 | NaN | DE | Germany | BB | Brandenburg | 12063 | Havelland | NaN | NaN | DE | DEU | 2
8 | DE_BB_12064 | ChIJzWlb3JjWqUcRxb06n0K3Wgk | Q6181 | NaN | DE | Germany | BB | Brandenburg | 12064 | Märkisch-Oderland | NaN | NaN | DE | DEU | 2
9 | DE_BB_12065 | ChIJI_gaHCoMqUcRbuKUa-ROMqo | Q6119 | NaN | DE | Germany | BB | Brandenburg | 12065 | Oberhavel | NaN | NaN | DE | DEU | 2

index | location_key | place_id | wikidata_id | datacommons_id | country_code | country_name | subregion1_code | subregion1_name | subregion2_code | subregion2_name | locality_code | locality_name | iso_3166_1_alpha_2 | iso_3166_1_alpha_3 | aggregation_level
5111 | US_WY_56027 | ChIJU2Z7QKeaY4cR0e7UZeDrIbs | Q485641 | geoId/56027 | US | United States of America | WY | Wyoming | 56027 | Niobrara County | NaN | NaN | US | USA | 2
5112 | US_WY_56029 | ChIJHazPIJZeRVMREOKSf3g0edY | Q156385 | geoId/56029 | US | United States of America | WY | Wyoming | 56029 | Park County | NaN | NaN | US | USA | 2
5113 | US_WY_56031 | ChIJOSq_r9fXZYcROriCFM_N0S4 | Q490512 | geoId/56031 | US | United States of America | WY | Wyoming | 56031 | Platte County | NaN | NaN | US | USA | 2
5114 | US_WY_56033 | ChIJvzJgFeONNVMRdBzqIEJBwJ0 | Q490522 | geoId/56033 | US | United States of America | WY | Wyoming | 56033 | Sheridan County | NaN | NaN | US | USA | 2
5115 | US_WY_56035 | ChIJjXutE5u9V4cRjNUtUPgMhns | Q490494 | geoId/56035 | US | United States of America | WY | Wyoming | 56035 | Sublette County | NaN | NaN | US | USA | 2
5116 | US_WY_56037 | ChIJEVZf4rRTWocRoqATF0f_rnA | Q484194 | geoId/56037 | US | United States of America | WY | Wyoming | 56037 | Sweetwater County | NaN | NaN | US | USA | 2
5117 | US_WY_56039 | ChIJV3wGpVVrUlMR3m18oGf5rvk | Q488912 | geoId/56039 | US | United States of America | WY | Wyoming | 56039 | Teton County | NaN | NaN | US | USA | 2
5118 | US_WY_56041 | ChIJR4w4T5CnUYcRvTPRgMmXumM | Q483973 | geoId/56041 | US | United States of America | WY | Wyoming | 56041 | Uinta County | NaN | NaN | US | USA | 2
5119 | US_WY_56043 | ChIJIXR5_L9BS1MRLvs8KUxeZC8 | Q112846 | geoId/56043 | US | United States of America | WY | Wyoming | 56043 | Washakie County | NaN | NaN | US | USA | 2
5120 | US_WY_56045 | ChIJd4Rqhed3YocR7ubT5-HgoJg | Q115413 | geoId/56045 | US | United States of America | WY | Wyoming | 56045 | Weston County | NaN | NaN | US | USA | 2